---
id: qa-agent-introduction
title: Introduction to RAG QA Agent Evaluation
sidebar_label: Introduction
---

In this tutorial, we'll show you how to set up a **comprehensive RAG QA Agent evaluation pipeline** in just a few minutes. Although this example focuses on a QA Agent, the concepts and guides presented here apply to _anyone building RAG systems_.

:::note
Before we begin, you'll need to log in to Confident AI, where we'll be analyzing our evaluation reports and building our datasets. To do so, run:

```bash
deepeval login
```

:::

We'll be covering everything from generating large synthetic datasets to running evaluations on your QA agent. More specifically, you'll be learning:

- How to [generate a synthetic dataset](#establish-the-qa-agent) from your knowledge base
- How to [define evaluation criteria](#establish-the-qa-agent) for your QA agent
- How to [choose the right metrics](#establish-the-qa-agent) for evaluating your RAG QA agent
- How to [pull your dataset](#establish-the-qa-agent) to run evaluations
- How to leverage `deepeval` to [run evaluations](#establish-the-qa-agent) and generate test reports
- How to [iterate on your QA agent's hyperparameters](#establish-the-qa-agent) to improve generation quality
- How to [catch regressions](#establish-the-qa-agent) in your system caused by hyperparameter changes

## Establish the QA Agent

In this tutorial, we'll be evaluating a QA Agent designed to answer questions about `MadeUpCompany`, a company specializing in data analytics solutions. This QA Agent is a RAG (Retrieval-Augmented Generation) system, meaning it retrieves relevant information from a knowledge base whenever a user submits a query.

The goal of the QA Agent is to provide relevant and factually correct answers to help users better understand MadeUpCompany's products and services, and keep them satisfied.
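To make the retrieval step concrete, here is a minimal sketch of what "retrieving relevant information from a knowledge base" can look like. It is not the tutorial's actual agent: the knowledge base entries are hypothetical placeholders, and the keyword-overlap scoring stands in for the embedding-based search a real RAG system would typically use.

```python
# Hypothetical in-memory knowledge base. These entries are placeholders
# for illustration only, not MadeUpCompany's real documents.
KNOWLEDGE_BASE = [
    "MadeUpCompany offers a cloud data analytics platform.",
    "MadeUpCompany provides 24/7 customer support by email.",
    "Pricing plans include a free tier and an enterprise tier.",
    "The office cafeteria serves lunch at noon.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k documents sharing the most words with the query.

    Keyword overlap is a stand-in (assumption) for vector similarity search.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,  # highest overlap first; ties keep their original order
    )
    return ranked[:top_k]


if __name__ == "__main__":
    for chunk in retrieve("What pricing plans are available?", KNOWLEDGE_BASE):
        print(chunk)
```

The retrieved chunks are what the agent would pass to the LLM as context, so the number of chunks returned (`top_k`) directly affects what the model can draw on when answering.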

:::info
In this tutorial, we'll focus on 3 hyperparameters. In practice, you may want to **experiment with additional hyperparameters** depending on the complexity of your system, but the core approach remains the same.
:::

The 3 hyperparameters are the model, the retriever's top-k value, and the prompt template. We'll be using `gpt-3.5` to power our QA Agent, with a top-k value of 3 for our retriever. Additionally, we'll use the following prompt template:

```python
prompt_template = """You are a helpful QA Agent designed to answer user questions
about a company's products and services. Your goal is to provide accurate, relevant,
and well-structured responses based on the information retrieved from the company's
knowledge base.
"""
```
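Since these three values will be varied during later iterations, it can help to keep them together in one place. The sketch below is one possible way to do that, assuming a hypothetical `QAAgentConfig` dataclass and `build_prompt` helper (neither is part of `deepeval`); it only assembles the prompt text and does not call any LLM.

```python
from dataclasses import dataclass

# Same template text as the tutorial's prompt_template above.
DEFAULT_PROMPT_TEMPLATE = """You are a helpful QA Agent designed to answer user questions
about a company's products and services. Your goal is to provide accurate, relevant,
and well-structured responses based on the information retrieved from the company's
knowledge base.
"""


@dataclass(frozen=True)
class QAAgentConfig:
    """Hypothetical container for the three hyperparameters in this tutorial."""

    model: str = "gpt-3.5"
    top_k: int = 3
    prompt_template: str = DEFAULT_PROMPT_TEMPLATE


def build_prompt(config: QAAgentConfig, question: str, retrieved_chunks: list[str]) -> str:
    """Combine the template, at most top_k retrieved chunks, and the user question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks[: config.top_k])
    return f"{config.prompt_template}\nContext:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    config = QAAgentConfig()
    # Placeholder chunks; a real agent would get these from its retriever.
    print(build_prompt(config, "What does the company sell?", ["chunk A", "chunk B"]))
```

Keeping the configuration immutable (`frozen=True`) means each experiment uses a new `QAAgentConfig` instance, which makes it easier to tell which hyperparameter values produced a given result.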

Unlike other LLM systems, it's much easier to build an evaluation dataset for QA Agents because a knowledge base is already available. Synthetic data generation techniques make it possible to generate a large, high-quality evaluation dataset in little time. We'll be exploring how to do this with `DeepEval` in the next section.
